Nucleotide, dinucleotide and trinucleotide frequencies explain patterns observed in chaos game representations of DNA sequences.
Abstract
The chaos game representation (CGR) is a scatter plot derived from a DNA sequence, with each point of the plot corresponding to one base of the sequence. If the DNA sequence were a random collection of bases, the CGR would be a uniformly filled square; conversely, any patterns visible in the CGR represent some pattern (information) in the DNA sequence. In this paper, patterns previously observed in a variety of DNA sequences are explained solely in terms of nucleotide, dinucleotide and trinucleotide frequencies.
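The construction described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own code: the corner assignment (A, C, G, T at the four corners of the unit square), the starting point at the centre, and the function names `cgr_points` and `kmer_frequencies` are assumptions made here, since the abstract does not specify them. Each base moves the current point halfway toward that base's corner, giving one plotted point per base; the second function tabulates nucleotide, dinucleotide and trinucleotide frequencies (k = 1, 2, 3).

```python
from collections import Counter

# Assumed corner assignment (a commonly used convention, not stated in the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return one (x, y) point per base of seq.

    Each point is the midpoint between the previous point and the corner
    assigned to the current base. Symbols other than A, C, G, T are skipped
    (a choice made for this sketch).
    """
    x, y = start
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def kmer_frequencies(seq, k):
    """Relative frequencies of overlapping k-mers made only of A, C, G, T."""
    seq = seq.upper()
    kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(m for m in kmers if set(m) <= set(CORNERS))
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


if __name__ == "__main__":
    example = "ACGTTGCAACGT"  # made-up sequence, for illustration only
    print(cgr_points(example)[:3])
    for k in (1, 2, 3):
        print(k, kmer_frequencies(example, k))
```

Under this construction, the sub-square of side 2^-k in which a point lands is determined by the last k bases read, so a k-mer that is rare in the sequence leaves its sub-square sparsely filled. That is the link between k-mer frequencies and visible CGR patterns.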
Similar resources
Simulation for chaos game representation of genomes by recurrent iterated function systems
... some pattern (information) in the DNA sequence (Goldman 1993). Goldman (1993) interpreted the CGRs in a biologically meaningful way. All points plotted within a quadrant must correspond to subsequences of the DNA sequence that ... Chaos game representation (CGR) of DNA sequences and linked protein sequences from genomes was proposed by Jeffrey (1990) and Yu et al. (2004), respectively. In this ...
Secondary Structural Analysis of Families of Protein Sequences using Chaos Game Representation
CGR is an effective method for visualizing any structural features if it is given as a sequence of elements [1,2] analyzed by the genomic signature appears as a powerful tool for investigating the mechanisms of DNA maintenance from which the DNA structure results. It would be necessary to understand the patterns they exhibit and to be able to interpret them in a biologically meaningful way [3]....
Prediction of Polyadenylation Signals in Human DNA Sequences using Nucleotide Frequencies
The polyadenylation signal plays a key role in determining the site for addition of a polyadenylated tail to nascent mRNA and its mutation(s) are reported in many diseases. Thus, identifying poly(A) sites is important for understanding the regulation and stability of mRNA. In this study, Support Vector Machine (SVM) models have been developed for predicting poly(A) signals in a DNA sequence usi...
Encoding DNA sequences by integer chaos game representation
Motivation: DNA sequences are fundamental for encoding genetic information. The genetic information may be understood not only by symbolic sequences but also from the hidden signals inside the sequences. The symbolic sequences need to be transformed into numerical sequences so the hidden signals can be revealed by signal processing techniques. All current transformation methods encode DNA seque...
Differential distribution of simple sequence repeats in eukaryotic genome sequences.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and t...
Journal title:
- Nucleic Acids Research
Volume 21, Issue 10
Pages -
Publication date 1993